
Neural Information Processing Systems

Although recently proposed parameter-efficient transfer learning (PETL) techniques allow updating a small subset of parameters, they do not reduce the memory cost of training. This is because the gradient computation for the trainable parameters still requires backpropagation through the large pre-trained backbone model.
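A minimal numpy sketch of the point above (an assumed toy setup, not the paper's method): even when only a small adapter is trainable, computing its gradient still requires backpropagating the error through every frozen backbone layer, so the backbone's activations and weight Jacobians must be kept around.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W1 = rng.normal(size=(d, d))   # frozen backbone layer 1
W2 = rng.normal(size=(d, d))   # frozen backbone layer 2
A = np.eye(d)                  # trainable adapter (the small parameter subset)

x = rng.normal(size=d)
h0 = A @ x                     # adapter output
h1 = np.tanh(W1 @ h0)          # frozen backbone layer 1
h2 = W2 @ h1                   # frozen backbone layer 2
loss = 0.5 * np.sum(h2 ** 2)

# Backward pass: the adapter gradient needs the transposed Jacobians of
# BOTH frozen layers, i.e. a full backprop through the backbone.
g_h2 = h2                                      # dL/dh2
g_h1 = W2.T @ g_h2                             # through frozen layer 2
g_h0 = W1.T @ (g_h1 * (1.0 - h1 ** 2))         # through frozen layer 1
grad_A = np.outer(g_h0, x)                     # gradient w.r.t. adapter only
```

Freezing `W1` and `W2` saves the optimizer state for those weights, but the backward traversal of the backbone (and its stored activations) remains, which is the memory bottleneck the abstract refers to.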


Towards Understanding the Importance of Shortcut Connections in Residual Networks

Neural Information Processing Systems

Residual Network (ResNet) is undoubtedly a milestone in deep learning. ResNet is equipped with shortcut connections between layers and exhibits efficient training using simple first-order algorithms. Despite the great empirical success, the reason behind it is far from well understood. In this paper, we study a two-layer non-overlapping convolutional ResNet. Training such a network requires solving a non-convex optimization problem with a spurious local optimum. We show, however, that gradient descent combined with proper normalization avoids being trapped by the spurious local optimum and converges to a global optimum in polynomial time, when the weight of the first layer is initialized at 0 and that of the second layer is initialized arbitrarily in a ball. Numerical experiments are provided to support our theory.
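A generic residual-block sketch (an assumed illustrative form, not the paper's exact two-layer convolutional model) showing why the zero initialization of the first layer interacts nicely with the shortcut connection: with weights at 0, the block reduces to the identity map, so training starts from a benign point rather than a random one.

```python
import numpy as np

def residual_block(x, W):
    # Shortcut connection: the input is added back to the layer's output.
    return x + np.tanh(W @ x)

x = np.arange(4.0)
W0 = np.zeros((4, 4))          # first-layer weights initialized at 0
y = residual_block(x, W0)      # tanh(0) == 0, so y == x: the identity map
```

With `W0 = 0` the transform branch contributes nothing and the shortcut carries the input through unchanged, which is the initialization regime the convergence result above assumes for the first layer.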





Supplementary Materials: Rethinking Alignment in Video Super-Resolution Transformers

Neural Information Processing Systems

The proposed patch alignment method can also be applied to the recurrent VSR framework. Recurrent methods are widely used in VSR and have achieved state-of-the-art performance. With a Transformer backbone, we can easily build a recurrent VSR Transformer. Alignment modules are not absent in the existing recurrent methods. The feature size is set to 100, and the number of attention heads is 4. The baseline is the original BasicVSR++ model that uses FGDC and a CNN backbone.
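A minimal sketch of the recurrent propagation structure described above (assumed skeleton; `align` and `refine` are hypothetical stand-ins for the patch-alignment and backbone stages, and only the feature size of 100 comes from the text): each frame reuses the hidden feature propagated from the previous frame.

```python
import numpy as np

C = 100  # propagated feature size, as stated in the text

def align(prev_feat, frame):
    # Hypothetical alignment stage: a placeholder that would normally warp
    # the propagated feature toward the current frame.
    return prev_feat

def refine(feat, frame):
    # Hypothetical backbone stage: mixes the aligned feature with the frame.
    return np.tanh(feat + frame)

frames = [np.random.default_rng(i).normal(size=C) for i in range(3)]
feat = np.zeros(C)             # initial hidden feature
outputs = []
for frame in frames:           # recurrent pass over the sequence
    feat = refine(align(feat, frame), frame)
    outputs.append(feat)
```

The key property of the recurrent framework is visible in the loop: `feat` threads through all frames, so alignment quality at one step affects every later step.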